
UnresolvedGenerator


UnresolvedGenerator Expression

UnresolvedGenerator is a Generator that represents an unresolved generator in a logical query plan.

UnresolvedGenerator is created exclusively when AstBuilder is requested to withGenerate (as part of Generate logical operator) for SQL’s LATERAL VIEW (in SELECT or FROM clauses).

UnresolvedGenerator can never be resolved (and is replaced at analysis phase).


Because UnresolvedGenerator can never be resolved, it cannot be evaluated either (i.e. produce a value given an internal row). When requested to evaluate, UnresolvedGenerator simply throws an UnsupportedOperationException.
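As a minimal sketch of the points above (assuming a Spark 2.x or 3.x classpath such as spark-shell; the table t and the column ns are made-up names and need not exist, since the query is only parsed and never analyzed), the following parses a LATERAL VIEW query and inspects the generator of the resulting Generate operator:

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedGenerator
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.logical.Generate
import scala.util.Try

// Parsing alone is enough: the parser creates a Generate operator for LATERAL VIEW
val plan = CatalystSqlParser.parsePlan(
  "SELECT n FROM t LATERAL VIEW explode(ns) e AS n")

// Collect the generator expression of every Generate operator in the parsed plan
val generators = plan.collect { case g: Generate => g.generator }

generators.foreach { g =>
  println(s"${g.getClass.getSimpleName} resolved=${g.resolved}")
}

// Evaluation is expected to fail; the exact exception type may vary by Spark version
val evalAttempt = Try(generators.head.eval(null))
println(s"eval failed: ${evalAttempt.isFailure}")
```

Only after the analyzer has run would the generator be replaced by a concrete Generator such as Explode.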

Note

UnresolvedGenerator is resolved to a concrete Generator expression when ResolveFunctions logical resolution rule is executed.

Note
UnresolvedGenerator is similar to UnresolvedFunction and differs mostly by the type (to make Spark development with Scala easier?)

Creating UnresolvedGenerator Instance

UnresolvedGenerator takes the following when created:

UnresolvedFunction


UnresolvedFunction Unevaluable Expression — Logical Representation of Functions in Queries

UnresolvedFunction is a Catalyst expression that represents a function (application) in a logical query plan.

UnresolvedFunction is created as a result of the following:

UnresolvedFunction can never be resolved (and is replaced at analysis phase).

Note
UnresolvedFunction is first looked up in LookupFunctions logical rule and then resolved in ResolveFunctions logical resolution rule.


Because UnresolvedFunction can never be resolved, it cannot be evaluated either (i.e. produce a value given an internal row). When requested to evaluate, UnresolvedFunction simply throws an UnsupportedOperationException.

Note
Unevaluable expressions are expressions that have to be replaced by some other expressions during analysis or optimization (or they fail analysis).
Tip
Use Catalyst DSL’s function or distinctFunction to create an UnresolvedFunction with the isDistinct flag off and on, respectively.
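The tip can be tried with a short sketch like the following. It assumes the Catalyst DSL of Spark 2.x or 3.x, where function and distinctFunction are available on symbols; the function name f is arbitrary and is not looked up at this point.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedFunction
import org.apache.spark.sql.catalyst.dsl.expressions._

// Symbol("f") is the portable spelling of the 'f symbol literal
val f: UnresolvedFunction = Symbol("f").function()
val distinctF: UnresolvedFunction = Symbol("f").distinctFunction()

println(s"isDistinct: ${f.isDistinct} vs ${distinctF.isDistinct}")
```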

Creating UnresolvedFunction (With Database Undefined) — apply Factory Method

apply first creates a FunctionIdentifier with the name and no database, and then creates an UnresolvedFunction with the FunctionIdentifier, children and isDistinct flag.
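A minimal sketch of calling the factory method directly, assuming the apply(name, children, isDistinct) overload of the companion object (present in Spark 2.x and 3.x); upper is just an example function name:

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedFunction
import org.apache.spark.sql.catalyst.expressions.Literal
import scala.util.Try

// Name only, no database; one child expression; DISTINCT off
val upper = UnresolvedFunction("upper", Seq(Literal("spark")), isDistinct = false)
println(s"resolved: ${upper.resolved}")

// Evaluation is expected to fail; the exact exception type may vary by Spark version
val evalAttempt = Try(upper.eval())
println(s"eval failed: ${evalAttempt.isFailure}")
```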

Note

apply is used when:

Creating UnresolvedFunction Instance

UnresolvedFunction takes the following when created:

UnresolvedAttribute


UnresolvedAttribute Leaf Expression

UnresolvedAttribute is a named Attribute leaf expression (i.e. it has a name) that represents a reference to an entity in a logical query plan.

UnresolvedAttribute is created when:

UnresolvedAttribute can never be resolved (and is replaced at analysis phase).

Note

UnresolvedAttribute is resolved when Analyzer is executed by the following logical resolution rules:


Because UnresolvedAttribute can never be resolved, it cannot be evaluated either (i.e. produce a value given an internal row). When requested to evaluate, UnresolvedAttribute simply throws an UnsupportedOperationException.

UnresolvedAttribute takes name parts when created.

UnresolvedAttribute can be created with a fully-qualified name with dots to separate name parts.

Tip

Use backticks (`) around names with dots (.) to disable them as separators.

For example, `a.b`.c is a two-part attribute name with a.b and c name parts.

UnresolvedAttribute can also be created so that dots in the name lose their special meaning, i.e. the whole name becomes a single name part.
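The three ways of building name parts can be compared in a short sketch. It assumes the quotedString and quoted factory methods of the UnresolvedAttribute companion object (as in Spark 2.x and 3.x); all attribute names are made up.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute

// Dots separate name parts
val twoParts = UnresolvedAttribute("db1.t2")

// Backticks protect the dot inside a.b, so there are still only two parts
val backticked = UnresolvedAttribute.quotedString("`a.b`.c")

// The whole string is taken as a single name part
val single = UnresolvedAttribute.quoted("attr.with.dots")

Seq(twoParts, backticked, single).foreach { attr =>
  println(s"${attr.nameParts.length} name part(s): ${attr.nameParts.mkString(" | ")}")
}
```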

Note

Catalyst DSL defines two Scala implicits to create an UnresolvedAttribute:

  • StringToAttributeConversionHelper is a Scala implicit class that converts $"colName" into an UnresolvedAttribute

  • symbolToUnresolvedAttribute is a Scala implicit method that converts 'colName into an UnresolvedAttribute

Both implicits are part of ExpressionConversions Scala trait of Catalyst DSL.

Import expressions object to get access to the expression conversions.
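A hedged sketch of both conversions follows. The $ interpolator is spelled out explicitly because spark-shell already imports spark.implicits._, which defines its own $ and would make $"name" ambiguous there; in a standalone program plain $"name" should work as well.

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
import org.apache.spark.sql.catalyst.dsl.expressions._

// symbolToUnresolvedAttribute applies because an UnresolvedAttribute is expected
val id: UnresolvedAttribute = Symbol("id")

// What $"name" expands to when only the Catalyst DSL implicits are in scope
val name: UnresolvedAttribute =
  StringToAttributeConversionHelper(StringContext("name")).$()

println(s"${id.nameParts} ${name.nameParts}")
```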

Note

An UnresolvedAttribute can be resolved, i.e. replaced by a NamedExpression, using an analyzed logical plan (of the structured query the attribute is part of).

UnixTimestamp


UnixTimestamp TimeZoneAware Binary Expression

UnixTimestamp is a binary expression with timezone support that represents unix_timestamp function (and indirectly to_date and to_timestamp).

Note
UnixTimestamp is a UnixTime expression internally (as is the ToUnixTimestamp expression).


UnixTimestamp supports StringType, DateType and TimestampType as input types for a time expression and returns LongType.

UnixTimestamp uses DateTimeUtils.newDateFormat for date/time format (as Java’s java.text.DateFormat).
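As a rough sketch, the expression can also be created by hand. This assumes that the first three constructor parameters are the time expression, the format and an optional time zone ID (as in Spark 2.2 and later); the input values are arbitrary.

```scala
import org.apache.spark.sql.catalyst.expressions.{Literal, UnixTimestamp}
import org.apache.spark.sql.types.LongType

// Time expression (a string here), its format, and an explicit time zone
val unixTs = UnixTimestamp(
  Literal("1970-01-02 00:00:00"),
  Literal("yyyy-MM-dd HH:mm:ss"),
  Some("UTC"))

println(unixTs.dataType)  // the result type, regardless of the input type
println(unixTs.eval())    // interpreted evaluation: seconds since the epoch
```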

UnaryExpression Contract


UnaryExpression Contract

UnaryExpression is…​FIXME

defineCodeGen Method

defineCodeGen…​FIXME

Note
defineCodeGen is used when…​FIXME

nullSafeEval Method

nullSafeEval simply fails with the following error (and is expected to be overridden to save null-check code):

Note
nullSafeEval is used exclusively when UnaryExpression is requested to eval.
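To illustrate how eval and nullSafeEval work together, the sketch below uses Not, a built-in UnaryExpression. It relies on an assumption taken from the Spark sources rather than from the text above: eval evaluates the child first, returns null for a null input, and only otherwise hands the value over to nullSafeEval.

```scala
import org.apache.spark.sql.catalyst.expressions.{Literal, Not, UnaryExpression}
import org.apache.spark.sql.types.BooleanType

// Non-null input: the value reaches Not's own nullSafeEval
val negated = Not(Literal(true))
println(negated.eval())

// Null input: the null check in eval short-circuits before nullSafeEval
val onNull = Not(Literal.create(null, BooleanType))
println(onNull.eval())
```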

Evaluating Expression — eval Method

Note
eval is part of Expression Contract for the interpreted (non-code-generated) expression evaluation, i.e. evaluating a Catalyst expression to a JVM object for a given internal binary row.

eval…​FIXME

TypedImperativeAggregate


TypedImperativeAggregate — Contract for Imperative Aggregate Functions with Custom Aggregation Buffer

TypedImperativeAggregate is the contract for imperative aggregation functions that allows an arbitrary user-defined Java object to be used as the internal aggregation buffer.

Table 1. TypedImperativeAggregate as ImperativeAggregate
ImperativeAggregate Method Description

aggBufferAttributes

aggBufferSchema

initialize

Creates an aggregation buffer and puts it at mutableAggBufferOffset position in the input buffer InternalRow.

inputAggBufferAttributes

Table 2. TypedImperativeAggregate’s Direct Implementations
Name Description

ApproximatePercentile

Collect

ComplexTypedAggregateExpression

CountMinSketchAgg

HiveUDAFFunction

Percentile

TypedImperativeAggregate Contract

Table 3. TypedImperativeAggregate Contract
Method Description

createAggregationBuffer

Used exclusively when a TypedImperativeAggregate is initialized

deserialize

eval

merge

serialize

update
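The life cycle implied by the contract can be modelled without Spark at all. The following is a deliberately simplified sketch: TypedAggregateSketch, SumCount and AverageSketch are hypothetical names, and the real trait works on InternalRow input and extends ImperativeAggregate, which is left out here.

```scala
import java.nio.ByteBuffer

// Hypothetical, simplified stand-in for the contract; NOT Spark's actual trait
trait TypedAggregateSketch[IN, BUF, OUT] {
  def createAggregationBuffer(): BUF
  def update(buffer: BUF, input: IN): BUF
  def merge(buffer: BUF, other: BUF): BUF
  def eval(buffer: BUF): OUT
  def serialize(buffer: BUF): Array[Byte]
  def deserialize(bytes: Array[Byte]): BUF
}

// An arbitrary user-defined object used as the aggregation buffer
final class SumCount(var sum: Long = 0L, var count: Long = 0L)

object AverageSketch extends TypedAggregateSketch[Long, SumCount, Double] {
  def createAggregationBuffer(): SumCount = new SumCount()
  def update(b: SumCount, in: Long): SumCount = { b.sum += in; b.count += 1; b }
  def merge(b: SumCount, o: SumCount): SumCount = { b.sum += o.sum; b.count += o.count; b }
  def eval(b: SumCount): Double = if (b.count == 0) Double.NaN else b.sum.toDouble / b.count
  def serialize(b: SumCount): Array[Byte] =
    ByteBuffer.allocate(16).putLong(b.sum).putLong(b.count).array()
  def deserialize(bytes: Array[Byte]): SumCount = {
    val bb = ByteBuffer.wrap(bytes)
    val sum = bb.getLong()
    val count = bb.getLong()
    new SumCount(sum, count)
  }
}

// Two "partitions" are aggregated separately; one buffer is shipped as bytes and merged
val left = Seq(1L, 2L, 3L)
  .foldLeft(AverageSketch.createAggregationBuffer())((b, x) => AverageSketch.update(b, x))
val right = Seq(4L, 5L)
  .foldLeft(AverageSketch.createAggregationBuffer())((b, x) => AverageSketch.update(b, x))
val shipped = AverageSketch.deserialize(AverageSketch.serialize(right))
val result = AverageSketch.eval(AverageSketch.merge(left, shipped))
println(result)  // 15 / 5 = 3.0
```

The serialize and deserialize pair is what lets the buffer be an arbitrary object: it only has to be turned into bytes when partial results are exchanged.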

TypedAggregateExpression


TypedAggregateExpression Expression

TypedAggregateExpression is the contract for AggregateFunction expressions that…​FIXME

TypedAggregateExpression is used when:

Table 1. TypedAggregateExpression Contract
Method Description

aggregator

Aggregator

inputClass

Used when…​FIXME

inputDeserializer

Used when…​FIXME

inputSchema

Used when…​FIXME

withInputInfo

Used when…​FIXME

Table 2. TypedAggregateExpressions
Aggregator Description

ComplexTypedAggregateExpression

SimpleTypedAggregateExpression

Creating TypedAggregateExpression — apply Factory Method

apply…​FIXME

Note
apply is used exclusively when Aggregator is requested to convert itself to a TypedColumn.
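A small sketch of that path, assuming Spark 2.x or 3.x, where a Column still exposes its Catalyst expression as expr. SumLongs is a made-up Aggregator used only for illustration.

```scala
import org.apache.spark.sql.{Encoder, Encoders, TypedColumn}
import org.apache.spark.sql.execution.aggregate.TypedAggregateExpression
import org.apache.spark.sql.expressions.Aggregator

// Hypothetical aggregator that sums Long values
object SumLongs extends Aggregator[Long, Long, Long] {
  def zero: Long = 0L
  def reduce(buffer: Long, value: Long): Long = buffer + value
  def merge(b1: Long, b2: Long): Long = b1 + b2
  def finish(reduction: Long): Long = reduction
  def bufferEncoder: Encoder[Long] = Encoders.scalaLong
  def outputEncoder: Encoder[Long] = Encoders.scalaLong
}

// toColumn wraps the aggregator in a TypedAggregateExpression
val sumColumn: TypedColumn[Long, Long] = SumLongs.toColumn

// Look for the typed aggregate inside the column's expression tree
val typedAggs = sumColumn.expr.collect { case t: TypedAggregateExpression => t }
typedAggs.foreach(t => println(t.getClass.getSimpleName))
```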

SubqueryExpression


SubqueryExpression Contract — Expressions With Logical Query Plans

SubqueryExpression is the contract for expressions with logical query plans (i.e. PlanExpression[LogicalPlan]).

Table 1. (Subset of) SubqueryExpression Contract
Method Description

withNewPlan

Used when:

Table 2. SubqueryExpressions
SubqueryExpression Description

Exists

ListQuery

ScalarSubquery

SubqueryExpression is resolved when the children are resolved and the subquery logical plan is resolved.
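As an illustrative sketch (the tables t and s and their columns are made up; the query is parsed only, so its subquery plans stay unresolved), the subquery expressions of a parsed query can be collected and their resolved flags inspected:

```scala
import org.apache.spark.sql.catalyst.expressions.SubqueryExpression
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

val plan = CatalystSqlParser.parsePlan(
  """SELECT (SELECT max(b) FROM s) AS m
    |FROM t
    |WHERE EXISTS (SELECT 1 FROM s) AND a IN (SELECT b FROM s)""".stripMargin)

// Walk every operator's expressions and pick out the subquery expressions
val subqueries = plan
  .flatMap(_.expressions)
  .flatMap(_.collect { case s: SubqueryExpression => s })

subqueries.foreach { s =>
  println(s"${s.nodeName}: resolved=${s.resolved}, plan resolved=${s.plan.resolved}")
}
```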

references…​FIXME

semanticEquals…​FIXME

canonicalize…​FIXME

hasInOrExistsSubquery Object Method

hasInOrExistsSubquery…​FIXME

Note
hasInOrExistsSubquery is used when…​FIXME

hasCorrelatedSubquery Object Method

hasCorrelatedSubquery…​FIXME

Note
hasCorrelatedSubquery is used when…​FIXME

hasSubquery Object Method

hasSubquery…​FIXME

Note
hasSubquery is used when…​FIXME

Creating SubqueryExpression Instance

SubqueryExpression takes the following when created:

StaticInvoke


StaticInvoke Non-SQL Expression

StaticInvoke is an expression with no SQL representation that represents a static method call in Scala or Java.

StaticInvoke supports Java code generation (aka whole-stage codegen) to evaluate itself.

StaticInvoke is created when:

  • ScalaReflection is requested for the deserializer or serializer for a Scala type

  • RowEncoder is requested for deserializerFor or serializer for a Scala type

  • JavaTypeInference is requested for deserializerFor or serializerFor

Note
StaticInvoke is similar to CallMethodViaReflection expression.

Creating StaticInvoke Instance

StaticInvoke takes the following when created:

  • Target object of the static call

  • Data type of the return value of the method

  • Name of the method to call on the static object

  • Optional expressions to pass as input arguments to the function

  • Flag to control whether to propagate nulls or not (enabled by default). If any of the arguments is null, null is returned instead of calling the function
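A minimal sketch of creating the expression by hand, assuming the first four constructor parameters are in the order listed above and the remaining ones keep their defaults. java.lang.Math.abs is only an example target, and the expression is built but not evaluated here.

```scala
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke
import org.apache.spark.sql.types.IntegerType

// Target class, return type, method name, input arguments
val absCall = StaticInvoke(
  classOf[java.lang.Math],
  IntegerType,
  "abs",
  Seq(Literal(-5)))

println(s"${absCall.staticObject.getName}.${absCall.functionName}")
println(s"propagateNull: ${absCall.propagateNull}")
```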

Star


Star Expression Contract

Star is a contract of leaf and named expressions that…​FIXME

Table 1. Star Contract
Method Description

expand

Used exclusively when ResolveReferences logical resolution rule is requested to expand Star expressions in the following logical operators:

Table 2. Stars
Star Description

ResolvedStar

UnresolvedRegex

UnresolvedStar
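As a quick sketch of where Star expressions come from, the SQL parser turns * and t.* into UnresolvedStar expressions, which stay unresolved until they are expanded during analysis. The qualifier t is a made-up name.

```scala
import org.apache.spark.sql.catalyst.analysis.{Star, UnresolvedStar}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser

// An unqualified star and a star qualified with a (hypothetical) table name
val all = CatalystSqlParser.parseExpression("*")
val qualified = CatalystSqlParser.parseExpression("t.*")

Seq(all, qualified).foreach {
  case s @ UnresolvedStar(target) =>
    println(s"target=$target resolved=${s.resolved}")
  case other =>
    println(s"not a star: $other")
}
```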
