
ScalaUDF

ScalaUDF — Catalyst Expression to Manage Lifecycle of User-Defined Function

ScalaUDF is a Catalyst expression that manages the lifecycle of a user-defined function (and hooks it into Spark SQL’s Catalyst execution path).

ScalaUDF is an ImplicitCastInputTypes expression and a UserDefinedExpression.
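To see how a user-defined function ends up as a ScalaUDF expression, the following minimal sketch wraps a Scala function with the `udf` standard function and inspects the Catalyst expression behind the resulting column. It assumes a Spark 2.x or 3.x classpath (e.g. spark-shell) where `Column.expr` is available; the names `strLen` and `lengthOfName` and the column name `name` are made up for illustration.

```scala
import org.apache.spark.sql.catalyst.expressions.ScalaUDF
import org.apache.spark.sql.functions.{col, udf}

// Wrap a plain Scala function as a user-defined function.
val strLen = udf((s: String) => s.length)

// Applying the UDF to a column gives a Column backed by a Catalyst expression.
// "name" is a hypothetical column; it stays unresolved until analysis.
val lengthOfName = strLen(col("name"))

// Column.expr exposes the underlying expression (assumption: Spark 2.x/3.x).
val lengthExpr = lengthOfName.expr

lengthExpr match {
  case u: ScalaUDF =>
    println(s"output data type: ${u.dataType}")
    println(s"child expressions: ${u.children}")
  case other =>
    println(s"not a ScalaUDF: $other")
}
```

No SparkSession is needed here, since the expression is only built and inspected, not executed as part of a query.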

ScalaUDF has no representation in SQL.

ScalaUDF is created when:

Note
Spark SQL Analyzer uses the HandleNullInputsForUDF logical evaluation rule to…FIXME

Generating Java Source Code (ExprCode) For Code-Generated Expression Evaluation — doGenCode Method

Note
doGenCode is part of the Expression Contract to generate Java source code (ExprCode) for code-generated expression evaluation.

doGenCode…​FIXME

Evaluating Expression — eval Method

Note
eval is part of the Expression Contract for interpreted (non-code-generated) expression evaluation, i.e. evaluating a Catalyst expression to a JVM object for a given internal binary row.

eval executes the Scala function on the input internal row.
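The interpreted path can be tried out directly, without running a query. The sketch below (again assuming Spark 2.x or 3.x, with illustrative names `wordLen` and `evalExpr`) applies a UDF to a literal so that the child expression needs no input columns, and then calls `eval` with an empty internal row.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.functions.{lit, udf}

// A UDF over a single String argument.
val wordLen = udf((s: String) => s.length)

// The only child is a literal, so evaluation does not read anything from the row.
val evalExpr = wordLen(lit("spark")).expr

// eval evaluates the child expression, converts the value to a Scala type,
// calls the Scala function and converts the result back to Catalyst's format.
val result = evalExpr.eval(InternalRow.empty)
println(result)
```

With child expressions that reference input columns, the expression would first have to be resolved and bound to a schema, which is what the analyzer and the physical operators take care of in a regular query.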

Creating ScalaUDF Instance

ScalaUDF takes the following when created:

  • A Scala function (as Scala’s AnyRef)

  • Output data type

  • Child Catalyst expressions

  • Input data types (if available)

  • Name (if defined)

  • nullable flag (turned on by default)

  • udfDeterministic flag (turned on by default)

ScalaUDF initializes the internal registries and counters.
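ScalaUDF is rarely created by hand. As a hedged illustration of how the name and the two flags above can be controlled from the public API, the following sketch uses `withName`, `asNonNullable` and `asNondeterministic` of `UserDefinedFunction` (available since Spark 2.3) and then reads the values back from the ScalaUDF expression. The UDF name `upper` and the column name `name` are hypothetical, and `Column.expr` is assumed to be available (Spark 2.x/3.x).

```scala
import org.apache.spark.sql.catalyst.expressions.ScalaUDF
import org.apache.spark.sql.functions.{col, udf}

// Start from a UDF with the default settings and override the name and both flags.
val upper = udf((s: String) => s.toUpperCase)
  .withName("upper")
  .asNonNullable()
  .asNondeterministic()

// The settings are passed on to the ScalaUDF created when the UDF is applied.
val upperExpr = upper(col("name")).expr.asInstanceOf[ScalaUDF]

println(s"name:          ${upperExpr.udfName}")
println(s"nullable:      ${upperExpr.nullable}")
println(s"deterministic: ${upperExpr.deterministic}")
```

Marking a UDF as non-deterministic matters for the optimizer, which avoids reordering or duplicating calls to such expressions.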
