ScalaUDAF — Catalyst Expression Adapter for UserDefinedAggregateFunction
ScalaUDAF
is a Catalyst expression adapter to manage the lifecycle of UserDefinedAggregateFunction and hook it in Spark SQL’s Catalyst execution path.
ScalaUDAF
is created when:
-
UserDefinedAggregateFunction
creates aColumn
for a user-defined aggregate function using all and distinct values (to use the UDAF in Dataset operators) -
UDFRegistration
is requested to register a user-defined aggregate function (to use the UDAF in SQL mode)
ScalaUDAF
is a ImperativeAggregate.
Method Name | Behaviour |
---|---|
Requests UserDefinedAggregateFunction to initialize |
|
Requests UserDefinedAggregateFunction to merge |
|
Requests UserDefinedAggregateFunction to update |
When evaluated, ScalaUDAF
…FIXME
ScalaUDAF
has no representation in SQL.
Name | Description |
---|---|
|
|
|
|
|
|
|
|
|
Copy of aggBufferAttributes |
|
|
|
Always enabled (i.e. |
Name | Description |
---|---|
Used when…FIXME |
|
Used when…FIXME |
|
Used when…FIXME |
|
Used when…FIXME |
Creating ScalaUDAF Instance
ScalaUDAF
takes the following when created:
-
Children Catalyst expressions
ScalaUDAF
initializes the internal registries and counters.
initialize
Method
1 2 3 4 5 |
initialize(buffer: InternalRow): Unit |
initialize
sets the input buffer
internal binary row as underlyingBuffer
of MutableAggregationBufferImpl and requests the UserDefinedAggregateFunction to initialize (with the MutableAggregationBufferImpl).
Note
|
initialize is part of ImperativeAggregate Contract.
|
update
Method
1 2 3 4 5 |
update(mutableAggBuffer: InternalRow, inputRow: InternalRow): Unit |
update
sets the input buffer
internal binary row as underlyingBuffer
of MutableAggregationBufferImpl and requests the UserDefinedAggregateFunction to update.
Note
|
update uses inputProjection on the input input and converts it using inputToScalaConverters.
|
Note
|
update is part of ImperativeAggregate Contract.
|
merge
Method
1 2 3 4 5 |
merge(buffer1: InternalRow, buffer2: InternalRow): Unit |
merge
first sets:
-
underlyingBuffer
of MutableAggregationBufferImpl to the inputbuffer1
-
underlyingInputBuffer
of InputAggregationBuffer to the inputbuffer2
merge
then requests the UserDefinedAggregateFunction to merge (passing in the MutableAggregationBufferImpl and InputAggregationBuffer).
Note
|
merge is part of ImperativeAggregate Contract.
|