InternalRow — Abstract Binary Row Format
Note: InternalRow is also called Catalyst row or Spark SQL row.
Note: UnsafeRow is a concrete InternalRow.
```scala
// The type of your business objects
case class Person(id: Long, name: String)

// The encoder for Person objects
import org.apache.spark.sql.Encoders
val personEncoder = Encoders.product[Person]

// The expression encoder for Person objects
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
val personExprEncoder = personEncoder.asInstanceOf[ExpressionEncoder[Person]]

// Convert Person objects to InternalRow
scala> val row = personExprEncoder.toRow(Person(0, "Jacek"))
row: org.apache.spark.sql.catalyst.InternalRow = [0,0,1800000005,6b6563614a]

// How many fields are available in Person's InternalRow?
scala> row.numFields
res0: Int = 2

// Are there any NULLs in this InternalRow?
scala> row.anyNull
res1: Boolean = false

// You can create your own InternalRow objects
import org.apache.spark.sql.catalyst.InternalRow
scala> val ir = InternalRow(5, "hello", (0, "nice"))
ir: org.apache.spark.sql.catalyst.InternalRow = [5,hello,(0,nice)]
```
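The row produced by the encoder above is in fact an UnsafeRow (see the note above), so its fields are read back positionally with typed getters. A minimal sketch, assuming the Spark 2.x ExpressionEncoder.toRow API from the session above; the commented values are expected, not captured, output:

```scala
import org.apache.spark.sql.catalyst.expressions.UnsafeRow

// The encoder-produced row is backed by the binary UnsafeRow format
row.isInstanceOf[UnsafeRow]   // true

// Fields are read back positionally with typed getters
row.getLong(0)                // 0
row.getUTF8String(1)          // Jacek (an org.apache.spark.unsafe.types.UTF8String)
```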
You can also create InternalRow objects using the factory methods of the InternalRow companion object, i.e. empty, apply and fromSeq.
```scala
import org.apache.spark.sql.catalyst.InternalRow

scala> InternalRow.empty
res0: org.apache.spark.sql.catalyst.InternalRow = [empty row]

scala> InternalRow(0, "string", (0, "pair"))
res1: org.apache.spark.sql.catalyst.InternalRow = [0,string,(0,pair)]

scala> InternalRow.fromSeq(Seq(0, "string", (0, "pair")))
res2: org.apache.spark.sql.catalyst.InternalRow = [0,string,(0,pair)]
```
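Rows built with these factory methods simply wrap the given values, so the typed getters only work when the values already use Spark's internal representation, e.g. UTF8String rather than java.lang.String for string fields. A minimal sketch; the commented values are expected, not captured, output:

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.unsafe.types.UTF8String

// Store the string as UTF8String so the typed getter below works
val ir = InternalRow(5, UTF8String.fromString("hello"))

ir.numFields          // 2
ir.getInt(0)          // 5
ir.getUTF8String(1)   // hello
```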