Generator

Generator Contract — Expressions to Generate Zero Or More Rows (aka Lateral Views)

Generator is a contract for Catalyst expressions that can produce zero or more rows given a single input row.

Note
Generator corresponds to SQL’s LATERAL VIEW.
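To make the "zero or more rows per input row" idea concrete, here is a minimal sketch in plain Python. It is a conceptual model only, not Spark's API: the names explode and lateral_view, the dict-based rows, and the outer flag (which mimics what LATERAL VIEW OUTER does) are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, Optional

Row = Dict[str, Any]


def explode(values: Optional[Iterable[Any]]) -> Iterator[Any]:
    """Toy generator: one output value per element, nothing for an empty or missing array."""
    yield from values or []


def lateral_view(
    rows: Iterable[Row],
    column: str,
    generator: Callable[[Any], Iterable[Any]],
    alias: str,
    outer: bool = False,
) -> Iterator[Row]:
    """Join every input row with the rows the generator produces for it."""
    for row in rows:
        generated = [{**row, alias: value} for value in generator(row[column])]
        if not generated and outer:
            # Hypothetical "outer" behaviour: keep the input row, pad with None.
            generated = [{**row, alias: None}]
        yield from generated


rows = [{"id": 1, "xs": [10, 20]}, {"id": 2, "xs": []}]
inner = list(lateral_view(rows, "xs", explode, "x"))
with_outer = list(lateral_view(rows, "xs", explode, "x", outer=True))
```

In this sketch the row with id 1 yields two output rows and the row with id 2 yields none, unless outer is set, in which case it yields a single row padded with None.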

dataType in Generator is simply an ArrayType of elementSchema.

Generator is not foldable and not nullable by default.

Generator supports Java code generation (aka whole-stage codegen) conditionally, i.e. only when the generator expression itself is not marked as CodegenFallback.

Generator uses terminate to signal that there are no more rows to process. It is the place to run clean-up code and, optionally, to emit additional rows.
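The eval and terminate life cycle can be sketched as follows. This is a plain-Python model under stated assumptions, not Spark code: CountingGenerator and run are hypothetical names, and the driver loop only illustrates the idea that terminate is called once after the last input row and may contribute extra output rows.

```python
from typing import Iterable, Iterator, List, Tuple

OutRow = Tuple[str, int]


class CountingGenerator:
    """Hypothetical generator that buffers state across rows."""

    def __init__(self) -> None:
        self.seen = 0

    def eval(self, values: List[int]) -> List[OutRow]:
        # Zero or more output rows for a single input row.
        self.seen += len(values)
        return [("value", v) for v in values]

    def terminate(self) -> List[OutRow]:
        # No more input: emit one extra row, then reset (clean up) the state.
        total, self.seen = self.seen, 0
        return [("count", total)]


def run(generator: CountingGenerator, rows: Iterable[List[int]]) -> Iterator[OutRow]:
    """Driver: eval for every input row, then terminate exactly once."""
    for row in rows:
        yield from generator.eval(row)
    yield from generator.terminate()


output = list(run(CountingGenerator(), [[1, 2], [], [3]]))
```

The empty input row contributes nothing, and the final ("count", 3) row exists only because terminate was given the chance to emit it.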

Table 1. Generators
Name Description

CollectionGenerator

ExplodeBase

Explode

GeneratorOuter

HiveGenericUDTF

Inline

Corresponds to inline and inline_outer functions.

JsonTuple

PosExplode

Stack

UnresolvedGenerator

Represents an unresolved generator.

Created when AstBuilder creates a Generate unary logical operator for a LATERAL VIEW clause in a SQL query.

Note
UnresolvedGenerator is resolved to Generator by ResolveFunctions logical evaluation rule.

UserDefinedGenerator

Used exclusively in the deprecated explode operator

Note

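You can have only one generator per select clause. This is enforced by the ExtractGenerator logical evaluation rule: a query with two or more generators in a single select clause fails analysis with an AnalysisException.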

If you want more than one generator in a structured query, use LATERAL VIEW, which is supported in SQL only.
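What chaining several lateral views amounts to can be modelled in plain Python. This is a rough sketch, not Spark's implementation: lateral_view is a hypothetical helper, rows are plain dicts, and each call stands in for one LATERAL VIEW clause applied to the output of the previous one.

```python
from typing import Any, Dict, Iterable, Iterator

Row = Dict[str, Any]


def lateral_view(rows: Iterable[Row], column: str, alias: str) -> Iterator[Row]:
    """One lateral view: explode the array in `column` into `alias`, keeping the input columns."""
    for row in rows:
        for value in row[column] or []:
            yield {**row, alias: value}


rows = [{"id": 1, "xs": [1, 2], "ys": ["a", "b"]}]

# Two generators in one query, expressed as two chained lateral views.
first = lateral_view(rows, "xs", "x")
second = lateral_view(first, "ys", "y")
result = [(r["id"], r["x"], r["y"]) for r in second]
```

Because the second view runs over the rows produced by the first, every input row ends up with one output row per combination of x and y.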

Generator Contract

Table 2. (Subset of) Generator Contract
Method Description

elementSchema

Schema of the elements to be generated

eval
