DecimalAggregates Logical Optimization
DecimalAggregates
is a base logical optimization that transforms Sum
and Average
aggregate functions on fixed-precision DecimalType
values to use UnscaledValue
(unscaled Long) values in WindowExpression and AggregateExpression expressions.
DecimalAggregates
is part of the Decimal Optimizations fixed-point batch in the standard batches of the Catalyst Optimizer.
DecimalAggregates
is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan]
.
Tip
|
Import
|
Example: sum Aggregate Function on Decimal with Precision Smaller Than 9
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
// sum aggregate with Decimal field with precision <= 8 val q = "SELECT sum(cast(id AS DECIMAL(5,0))) FROM range(1)" scala> sql(q).explain(true) == Parsed Logical Plan == 'Project [unresolvedalias('sum(cast('id as decimal(5,0))), None)] +- 'UnresolvedTableValuedFunction range, [1] == Analyzed Logical Plan == sum(CAST(id AS DECIMAL(5,0))): decimal(15,0) Aggregate [sum(cast(id#104L as decimal(5,0))) AS sum(CAST(id AS DECIMAL(5,0)))#106] +- Range (0, 1, step=1, splits=None) == Optimized Logical Plan == Aggregate [MakeDecimal(sum(UnscaledValue(cast(id#104L as decimal(5,0)))),15,0) AS sum(CAST(id AS DECIMAL(5,0)))#106] +- Range (0, 1, step=1, splits=None) == Physical Plan == *HashAggregate(keys=[], functions=[sum(UnscaledValue(cast(id#104L as decimal(5,0))))], output=[sum(CAST(id AS DECIMAL(5,0)))#106]) +- Exchange SinglePartition +- *HashAggregate(keys=[], functions=[partial_sum(UnscaledValue(cast(id#104L as decimal(5,0))))], output=[sum#108L]) +- *Range (0, 1, step=1, splits=None) |
Example: avg Aggregate Function on Decimal with Precision Smaller Than 12
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
// avg aggregate with Decimal field with precision <= 11 val q = "SELECT avg(cast(id AS DECIMAL(10,0))) FROM range(1)" scala> val q = "SELECT avg(cast(id AS DECIMAL(10,0))) FROM range(1)" q: String = SELECT avg(cast(id AS DECIMAL(10,0))) FROM range(1) scala> sql(q).explain(true) == Parsed Logical Plan == 'Project [unresolvedalias('avg(cast('id as decimal(10,0))), None)] +- 'UnresolvedTableValuedFunction range, [1] == Analyzed Logical Plan == avg(CAST(id AS DECIMAL(10,0))): decimal(14,4) Aggregate [avg(cast(id#115L as decimal(10,0))) AS avg(CAST(id AS DECIMAL(10,0)))#117] +- Range (0, 1, step=1, splits=None) == Optimized Logical Plan == Aggregate [cast((avg(UnscaledValue(cast(id#115L as decimal(10,0)))) / 1.0) as decimal(14,4)) AS avg(CAST(id AS DECIMAL(10,0)))#117] +- Range (0, 1, step=1, splits=None) == Physical Plan == *HashAggregate(keys=[], functions=[avg(UnscaledValue(cast(id#115L as decimal(10,0))))], output=[avg(CAST(id AS DECIMAL(10,0)))#117]) +- Exchange SinglePartition +- *HashAggregate(keys=[], functions=[partial_avg(UnscaledValue(cast(id#115L as decimal(10,0))))], output=[sum#120, count#121L]) +- *Range (0, 1, step=1, splits=None) |
Executing Rule — apply
Method
1 2 3 4 5 |
apply(plan: LogicalPlan): LogicalPlan |
Note
|
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).
|
apply
…FIXME