ExpressionEncoder — Expression-Based Encoder
ExpressionEncoder[T]
is a generic Encoder of JVM objects of the type T
to and from internal binary rows.
ExpressionEncoder[T]
uses expressions for a serializer and a deserializer.
Note
|
ExpressionEncoder is the only supported implementation of Encoder which is explicitly enforced when Dataset is created (even though Dataset data structure accepts a bare Encoder[T] ).
|
ExpressionEncoder
uses serializer expressions to encode (aka serialize) a JVM object of type T
to an internal binary row format (i.e. InternalRow
).
Note
|
It is assumed that all serializer expressions contain at least one and the same BoundReference. |
ExpressionEncoder
uses a deserializer expression to decode (aka deserialize) a JVM object of type T
from internal binary row format.
ExpressionEncoder
is flat when serializer uses a single expression (which also means that the objects of a type T
are not created using constructor parameters only like Product
or DefinedByConstructorParams
types).
Internally, a ExpressionEncoder
creates a UnsafeProjection (for the input serializer), a InternalRow (of size 1
), and a safe Projection
(for the input deserializer). They are all internal lazy attributes of the encoder.
Property | Description |
---|---|
Used exclusively when |
|
Used exclusively when |
|
Used…FIXME |
Note
|
Encoders object contains the default ExpressionEncoders for Scala and Java primitive types, e.g. boolean , long , String , java.sql.Date , java.sql.Timestamp , Array[Byte] .
|
Creating ExpressionEncoder Instance
ExpressionEncoder
takes the following when created:
-
Serializer expressions (to convert objects of type
T
to internal rows) -
Deserializer expression (to convert internal rows to objects of type
T
) -
Scala’s ClassTag for the JVM type
T
Creating Deserialize Expression — ScalaReflection.deserializerFor
Method
deserializerFor
creates an expression to deserialize from internal binary row format to a Scala object of type T
.
Internally, deserializerFor
calls the recursive internal variant of deserializerFor with a single-element walked type path with - root class: "[clsName]"
Tip
|
Read up on Scala’s TypeTags in TypeTags and Manifests.
|
Note
|
deserializerFor is used exclusively when ExpressionEncoder is created for a Scala type T .
|
Recursive Internal deserializerFor
Method
JVM Type (Scala or Java) | Deserialize Expressions |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
User Defined Types (UDTs) |
|
Creating Serialize Expression — ScalaReflection.serializerFor
Method
serializerFor
creates a CreateNamedStruct expression to serialize a Scala object of type T
to internal binary row format.
Internally, serializerFor
calls the recursive internal variant of serializerFor with a single-element walked type path with - root class: "[clsName]"
and pattern match on the result expression.
Caution
|
FIXME the pattern match part |
Tip
|
Read up on Scala’s TypeTags in TypeTags and Manifests.
|
Note
|
serializerFor is used exclusively when ExpressionEncoder is created for a Scala type T .
|
Recursive Internal serializerFor
Method
serializerFor
creates an expression for serializing an object of type T
to an internal row.
Caution
|
FIXME |
Encoding JVM Object to Internal Binary Row Format — toRow
Method
toRow
encodes (aka serializes) a JVM object t
as an internal binary row.
Internally, toRow
sets the only JVM object to be t
in inputRow and converts the inputRow
to a unsafe binary row (using extractProjection).
In case of any exception while serializing, toRow
reports a RuntimeException
:
Note
|
|
Decoding JVM Object From Internal Binary Row Format — fromRow
Method
fromRow
decodes (aka deserializes) a JVM object from a row
InternalRow (with the required values only).
Internally, fromRow
uses constructProjection with row
and gets the 0th element of type ObjectType
that is then cast to the output type T
.
In case of any exception while deserializing, fromRow
reports a RuntimeException
:
Note
|
|
resolveAndBind
Method
resolveAndBind
…FIXME
Note
|
|