
ScalarSubquery (ExecSubqueryExpression) Expression

ScalarSubquery is an ExecSubqueryExpression that gives exactly one value: the result of executing a SubqueryExec subquery that yields a single row with a single column, or null if no rows were computed.

Important
Spark SQL uses the name ScalarSubquery twice: once for an ExecSubqueryExpression (this page) and once for a SubqueryExpression. The shared name is confusing, so keep in mind which of the two is meant.

ScalarSubquery is created exclusively when PlanSubqueries physical optimization is executed (and plans a ScalarSubquery expression).

ScalarSubquery expression cannot be evaluated (i.e. produce a value given an internal row) until it has been updated with the collected result of the subquery.

ScalarSubquery uses…​FIXME…​for the data type.

Table 1. ScalarSubquery’s Internal Properties (e.g. Registries, Counters and Flags)

Name      Description
result    The value of the single column in a single row after collecting the rows from executing the subquery plan or null if no rows were collected.
updated   Flag that says whether ScalarSubquery was updated with collected result of executing the subquery plan.

Creating ScalarSubquery Instance

ScalarSubquery takes the following when created:

Updating ScalarSubquery With Collected Result — updateResult Method

Note
updateResult is part of ExecSubqueryExpression Contract to fill a Catalyst expression with a collected result from executing a subquery plan.

updateResult requests SubqueryExec physical plan to execute and collect internal rows.

updateResult sets result to the value of the only column of the single row or null if no rows were collected.

In the end, updateResult marks the ScalarSubquery instance as updated.

updateResult reports a RuntimeException when there is more than one row in the result.

updateResult reports an AssertionError when the number of fields is not exactly 1.
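
The steps above can be sketched in plain Scala without a Spark dependency. This is a simplified model under stated assumptions, not Spark's actual class: ScalarSubqueryModel, the collect function (standing in for executing the SubqueryExec plan), the accessor names, the modelling of a row as a Seq of column values, and the error messages are all hypothetical and used for illustration only.

```scala
// Simplified stand-in for ScalarSubquery; NOT Spark's actual implementation.
// `collect` is a hypothetical substitute for executing the SubqueryExec plan
// and collecting its rows; each row is modelled as a Seq of column values.
final class ScalarSubqueryModel(collect: () => Seq[Seq[Any]]) {
  private var result: Any = null
  private var updated: Boolean = false

  def updateResult(): Unit = {
    val rows = collect()
    if (rows.length > 1) {
      // More than one row: reported as a RuntimeException (illustrative message)
      throw new RuntimeException("more than one row returned by the subquery")
    }
    if (rows.length == 1) {
      // Exactly one field is expected; Scala's assert throws an AssertionError otherwise
      assert(rows.head.length == 1, s"expected 1 field, but got ${rows.head.length}")
      result = rows.head.head
    } else {
      // No rows collected: the result is null
      result = null
    }
    // In the end, mark the instance as updated
    updated = true
  }

  // Hypothetical accessors, added only so the state can be inspected
  def currentResult: Any = result
  def isUpdated: Boolean = updated
}
```

With this model, a collect function that returns Seq(Seq(42)) leaves the result at 42, an empty Seq leaves it at null, and in both cases the updated flag is set. Two or more rows, or a single row with more than one field, raise the errors described above and leave the flag unset.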

Evaluating Expression — eval Method

Note
eval is part of Expression Contract for the interpreted (non-code-generated) expression evaluation, i.e. evaluating a Catalyst expression to a JVM object for a given internal binary row.

eval simply returns result value.

eval reports an IllegalArgumentException if the ScalarSubquery expression has not been updated yet.
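
A minimal plain-Scala sketch of this guard-then-return behaviour follows. It is an illustration only, not Spark's code: UpdatableValue, its update method (standing in for updateResult) and the error message are hypothetical, and the input row is modelled as a Seq of values.

```scala
// Hypothetical model, NOT Spark's class: the stored value can only be read
// after it has been updated.
final class UpdatableValue {
  private var result: Any = null
  private var updated: Boolean = false

  // Stand-in for updateResult: store the collected value and flip the flag
  def update(value: Any): Unit = {
    result = value
    updated = true
  }

  // Models eval: the input row is ignored and the stored result is returned.
  // `require` throws an IllegalArgumentException when its condition is false.
  def eval(input: Seq[Any] = Seq.empty): Any = {
    require(updated, "result has not been set yet") // illustrative message
    result
  }
}
```

The design point the sketch tries to capture is that the returned value does not depend on the input row at all: once updated, every call yields the same stored result.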

Generating Java Source Code (ExprCode) For Code-Generated Expression Evaluation — doGenCode Method

Note
doGenCode is part of Expression Contract to generate Java source code (ExprCode) for code-generated expression evaluation.

doGenCode first makes sure that the updated flag is on (true). If not, doGenCode throws an IllegalArgumentException exception with the following message:

doGenCode then creates a Literal (for the result and the dataType) and simply requests it to generate the Java source code.
