Row

Row is a generic row object with an ordered collection of fields that can be accessed by an ordinal / an index (aka generic access by ordinal), through typed getters such as getInt (aka native primitive access), by a field name (when the Row has a schema), or using Scala's pattern matching.

Note
Row is also called Catalyst Row.

Row may have a schema.

Row has the following properties:

  • length or size: Row knows the number of elements (columns).

  • schema: Row knows the schema, if one is defined.

Row is defined in the org.apache.spark.sql package (its fully-qualified name is org.apache.spark.sql.Row).

Creating Row — apply Factory Method

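Row instances are created with the apply factory method of the Row companion object, which accepts the field values in order.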
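A minimal sketch of creating a Row through the companion object's apply method, assuming it is run in spark-shell or any Scala session with spark-sql on the classpath. The field values are made up for illustration.

```scala
import org.apache.spark.sql.Row

// Row.apply takes a variable number of values of any type.
val row = Row(1, "hello", 3.14)

// length and size both report the number of fields (columns).
val fieldCount = row.length
val sameCount = row.size
```

A Row created this way carries no schema, so fields can only be addressed by position.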

Field Access by Index — apply and get methods

Fields of a Row instance can be accessed by index (starting from 0) using apply or get.

Note
Generic access by ordinal (using apply or get) returns a value of type Any.
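The following sketch illustrates access by index, assuming a Scala session with spark-sql on the classpath. The sample values are hypothetical.

```scala
import org.apache.spark.sql.Row

val row = Row(1, "hello")

// apply and get are interchangeable; both results are typed as Any.
val first: Any = row(0)
val second: Any = row.get(1)

// A cast (or a pattern match) is needed to recover the concrete type.
val text: String = second.asInstanceOf[String]
```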

Get Field As Type — getAs method

You can query for fields with their proper types using getAs with an index, or with a field name when the Row has a schema.
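As a small sketch of typed access (again assuming spark-sql on the classpath, with made-up values), getAs takes the expected type as a type parameter, while getters such as getInt return a primitive directly:

```scala
import org.apache.spark.sql.Row

val row = Row(1, "hello")

// getAs[T] returns the field at the given index as T.
val id: Int = row.getAs[Int](0)
val text: String = row.getAs[String](1)

// Typed getters are an alternative for primitive fields.
val sameId: Int = row.getInt(0)
```

The requested type is not checked against a schema here, so asking for the wrong type would typically surface as a ClassCastException at the point of use.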

Note

FIXME

Schema

A Row instance can have a schema defined.

Note
Unless you are instantiating Row yourself (using Row Object), a Row always has a schema.
Note
RowEncoder takes care of assigning a schema to a Row when you call toDF on a Dataset or create a DataFrame through DataFrameReader.
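The difference between the two cases can be sketched as follows. This assumes a local SparkSession can be started (in spark-shell, the existing spark value could be used instead); the column names id and text are hypothetical.

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Assumption: a local session is acceptable for this illustration.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("row-schema-sketch")
  .getOrCreate()
import spark.implicits._

// A Row built by hand has no schema attached.
val standalone = Row(1, "hello")
val noSchema = standalone.schema

// A Row taken from a DataFrame carries the DataFrame's schema.
val fromDF = Seq((1, "hello")).toDF("id", "text").head()
val fieldNames = fromDF.schema.fieldNames.toSeq

// With a schema present, fields can also be looked up by name.
val text = fromDF.getAs[String]("text")
```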

Row Object

Row companion object offers factory methods to create Row instances from a variable number of elements (apply), a sequence of elements (fromSeq) and tuples (fromTuple).

Row object can merge Row instances.

It can also return an empty Row instance.
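A brief sketch of the companion object's methods, assuming spark-sql on the classpath and made-up values. Note that merge is marked deprecated in Spark 3.x, so it may produce a compiler warning.

```scala
import org.apache.spark.sql.Row

// Three ways to build the same two-field Row.
val viaApply = Row(1, "hello")
val viaSeq = Row.fromSeq(Seq(1, "hello"))
val viaTuple = Row.fromTuple((1, "hello"))

// merge concatenates the fields of several Rows into one
// (deprecated in Spark 3.x).
val merged = Row.merge(Row(1), Row("hello"))

// empty is a Row with no fields.
val nothing = Row.empty
```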

Pattern Matching on Row

Row can be used in pattern matching (since Row Object comes with unapplySeq).
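A minimal sketch of destructuring a Row with a pattern match, assuming spark-sql on the classpath. The field layout (an Int followed by a String) is hypothetical.

```scala
import org.apache.spark.sql.Row

val row = Row(1, "hello")

// Row.unapplySeq exposes the fields, so each one can be
// bound and type-checked inside the pattern.
val described = row match {
  case Row(id: Int, text: String) => s"id=$id, text=$text"
  case _                          => "unexpected shape"
}
```

The fallback case keeps the match total when a Row has a different number of fields or different field types.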
