Coalesce Expression
Coalesce is a Catalyst expression to represent coalesce standard function or SQL’s coalesce function in structured queries.
When created, Coalesce takes Catalyst expressions (as the children).
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
import org.apache.spark.sql.catalyst.expressions.Coalesce // Use Catalyst DSL import org.apache.spark.sql.catalyst.dsl.expressions._ import org.apache.spark.sql.functions.lit val coalesceExpr = Coalesce(children = Seq(lit(null).expr % 1, lit(null).expr, 1d)) scala> println(coalesceExpr.numberedTreeString) 00 coalesce((null % 1), null, 1.0) 01 :- (null % 1) 02 : :- null 03 : +- 1 04 :- null 05 +- 1.0 |
|
Caution
|
FIXME Describe FunctionArgumentConversion and Coalesce |
Spark Optimizer uses NullPropagation logical optimization to remove null literals (in the children expressions). That could result in a static evaluation that gives null value if all children expressions are null literals.
|
1 2 3 4 5 6 7 8 |
// FIXME // Demo Coalesce with nulls only // Demo Coalesce with null and non-null expressions that are optimized to one expression (in NullPropagation) // Demo Coalesce with non-null expressions after NullPropagation optimization |
Coalesce is also created when:
-
Analyzeris requested to commonNaturalJoinProcessing forFullOuterjoin type -
RewriteDistinctAggregateslogical optimization is requested torewrite -
ExtractEquiJoinKeysScala extractor is requested to destructure a logical plan -
ColumnStatis requested to statExprs -
IfNullexpression is created -
Nvlexpression is created -
Whenever
Castexpression is used in Catalyst expressions (e.g.Average,Sum)
spark技术分享