UserDefinedFunction
UserDefinedFunction represents a user-defined function.

UserDefinedFunction is created when:
- udf function is executed
- UDFRegistration is requested to register a Scala function as a user-defined function (in FunctionRegistry)
```scala
import org.apache.spark.sql.functions.udf

scala> val lengthUDF = udf { s: String => s.length }
lengthUDF: org.apache.spark.sql.expressions.UserDefinedFunction = UserDefinedFunction(<function1>,IntegerType,Some(List(StringType)))

scala> lengthUDF($"name")
res1: org.apache.spark.sql.Column = UDF(name)
```
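The two creation paths listed above can be compared side by side. The following is a minimal sketch, assuming a local SparkSession; the function name strLength, the view name names and the sample rows are made up for illustration and are not part of the original text.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

// Assumption: a throwaway local session, only for this sketch
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("udf-creation-sketch")
  .getOrCreate()
import spark.implicits._

// Path 1: the udf function creates a UserDefinedFunction for use in the Dataset API
val lengthUDF = udf((s: String) => s.length)

// Path 2: UDFRegistration (spark.udf) registers the function under a name
// so it can be called from SQL; register also returns a UserDefinedFunction
val registered = spark.udf.register("strLength", (s: String) => s.length)

// Hypothetical sample data
val names = Seq("Jacek", "Agata").toDF("name")
names.createOrReplaceTempView("names")

// Dataset API usage
names.select(lengthUDF($"name")).show()

// SQL usage through the registered name
spark.sql("SELECT strLength(name) FROM names").show()
```

The UserDefinedFunction returned by register can be used in the Dataset API exactly like lengthUDF, which is convenient when the same function is needed in both SQL and Scala code.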
UserDefinedFunction can also have a name.

```scala
val namedLengthUDF = lengthUDF.withName("lengthUDF")

scala> namedLengthUDF($"name")
res2: org.apache.spark.sql.Column = UDF:lengthUDF(name)
```
UserDefinedFunction is nullable by default, but can be marked as non-nullable.

```scala
val nonNullableLengthUDF = lengthUDF.asNonNullable

scala> nonNullableLengthUDF.nullable
res1: Boolean = false
```
Executing UserDefinedFunction (Creating Column with ScalaUDF Expression) — apply Method

```scala
apply(exprs: Column*): Column
```
```scala
import org.apache.spark.sql.functions.udf

scala> val lengthUDF = udf { s: String => s.length }
lengthUDF: org.apache.spark.sql.expressions.UserDefinedFunction = UserDefinedFunction(<function1>,IntegerType,Some(List(StringType)))

scala> lengthUDF($"name")
res1: org.apache.spark.sql.Column = UDF(name)
```
Note: apply is used when…FIXME
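Because apply accepts a variable number of Column arguments, a function of more than one parameter is invoked the same way as the single-argument lengthUDF. The following is a minimal sketch, assuming a local SparkSession; fullNameUDF, the column names and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

// Assumption: a throwaway local session, only for this sketch
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("udf-apply-sketch")
  .getOrCreate()
import spark.implicits._

// A two-argument Scala function wrapped as a UserDefinedFunction
val fullNameUDF = udf((first: String, last: String) => s"$first $last")

// Hypothetical sample data
val people = Seq(("Ada", "Lovelace"), ("Alan", "Turing")).toDF("first", "last")

// apply(exprs: Column*) takes one Column per function parameter
// and gives back a Column that can be used in select, filter, etc.
val withFullName = people.select(fullNameUDF($"first", $"last").as("full_name"))
withFullName.show()
```

The number of Column arguments should match the arity of the wrapped Scala function.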
Marking UserDefinedFunction as NonNullable — asNonNullable Method

```scala
asNonNullable(): UserDefinedFunction
```

asNonNullable…FIXME

Note: asNonNullable is used when…FIXME
Naming UserDefinedFunction — withName Method

```scala
withName(name: String): UserDefinedFunction
```

withName…FIXME

Note: withName is used when…FIXME
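Since both withName and asNonNullable return a UserDefinedFunction, the calls can be chained. The following is a minimal sketch of that usage; it only relies on the two signatures shown above, and the value name tuned is made up for illustration.

```scala
import org.apache.spark.sql.functions.udf

// Start from a plain UserDefinedFunction
val lengthUDF = udf((s: String) => s.length)

// Chain the two methods: each call returns a UserDefinedFunction
val tuned = lengthUDF
  .withName("lengthUDF") // name shown when the UDF appears in a Column or a query plan
  .asNonNullable()       // marks the result as non-nullable

// Inspect the nullability flag of the resulting function
println(tuned.nullable)
```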
Creating UserDefinedFunction Instance
UserDefinedFunction takes the following when created:
- A Scala function
- Output data type
- Input data types (if available)

UserDefinedFunction initializes the internal registries and counters.