Coalesce Expression
Coalesce
is a Catalyst expression to represent coalesce standard function or SQL’s coalesce function in structured queries.
When created, Coalesce
takes Catalyst expressions (as the children).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
import org.apache.spark.sql.catalyst.expressions.Coalesce // Use Catalyst DSL import org.apache.spark.sql.catalyst.dsl.expressions._ import org.apache.spark.sql.functions.lit val coalesceExpr = Coalesce(children = Seq(lit(null).expr % 1, lit(null).expr, 1d)) scala> println(coalesceExpr.numberedTreeString) 00 coalesce((null % 1), null, 1.0) 01 :- (null % 1) 02 : :- null 03 : +- 1 04 :- null 05 +- 1.0 |
Caution
|
FIXME Describe FunctionArgumentConversion and Coalesce |
Spark Optimizer uses NullPropagation logical optimization to remove null
literals (in the children expressions). That could result in a static evaluation that gives null
value if all children expressions are null
literals.
1 2 3 4 5 6 7 8 |
// FIXME // Demo Coalesce with nulls only // Demo Coalesce with null and non-null expressions that are optimized to one expression (in NullPropagation) // Demo Coalesce with non-null expressions after NullPropagation optimization |
Coalesce
is also created when:
-
Analyzer
is requested to commonNaturalJoinProcessing forFullOuter
join type -
RewriteDistinctAggregates
logical optimization is requested torewrite
-
ExtractEquiJoinKeys
Scala extractor is requested to destructure a logical plan -
ColumnStat
is requested to statExprs -
IfNull
expression is created -
Nvl
expression is created -
Whenever
Cast
expression is used in Catalyst expressions (e.g.Average
,Sum
)