RewritePredicateSubquery

RewritePredicateSubquery Logical Optimization

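RewritePredicateSubquery is a logical optimization that rewrites Filter operators with predicate subquery expressions into Join operators:

  • Filter operators with Exists or In with ListQuery expressions give left-semi joins

  • Filter operators with Not over Exists or over In with ListQuery expressions give left-anti joins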

Note
Prefer EXISTS (over Not with In with ListQuery subquery expression) if performance matters, since a comment in the rule's source code warns that the Not with In case “will almost certainly be planned as a Broadcast Nested Loop join”.

RewritePredicateSubquery is part of the RewriteSubquery once-executed batch in the standard batches of the Catalyst Optimizer.

RewritePredicateSubquery is simply a Catalyst rule for transforming logical plans, i.e. Rule[LogicalPlan].

rewriteExistentialExpr Internal Method

rewriteExistentialExpr…​FIXME

Note
rewriteExistentialExpr is used when…​FIXME

dedupJoin Internal Method

dedupJoin…​FIXME

Note
dedupJoin is used when…​FIXME

getValueExpression Internal Method

getValueExpression…​FIXME

Note
getValueExpression is used when…​FIXME

Executing Rule — apply Method

Note
apply is part of the Rule Contract to execute (apply) a rule on a TreeNode (e.g. LogicalPlan).

apply transforms Filter unary operators in the input logical plan.

apply splits conjunctive predicates in the condition expression (i.e. expressions separated by And expression) and then partitions them into two collections of expressions with and without In or Exists subquery expressions.
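The split-and-partition step can be sketched on a toy expression tree. This is a minimal illustration only: the case classes below are hypothetical stand-ins and not Spark's Catalyst classes, and subqueries are represented by plain strings.

```scala
object SplitSketch {
  // Hypothetical toy expression tree (NOT Spark's Catalyst expressions).
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class And(left: Expr, right: Expr) extends Expr
  case class Not(child: Expr) extends Expr
  case class Exists(subquery: String) extends Expr
  case class In(value: Expr, subquery: String) extends Expr

  // Flatten nested And nodes into the list of their conjuncts.
  def splitConjuncts(e: Expr): Seq[Expr] = e match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => Seq(other)
  }

  // True if the expression contains an Exists or In subquery anywhere.
  def hasSubquery(e: Expr): Boolean = e match {
    case _: Exists | _: In => true
    case And(l, r)         => hasSubquery(l) || hasSubquery(r)
    case Not(c)            => hasSubquery(c)
    case _                 => false
  }

  // First collection: predicates with subqueries; second: those without.
  def partition(cond: Expr): (Seq[Expr], Seq[Expr]) =
    splitConjuncts(cond).partition(hasSubquery)

  // a AND EXISTS(sq1) AND NOT (b IN sq2)
  val sample: Expr =
    And(And(Attr("a"), Exists("sq1")), Not(In(Attr("b"), "sq2")))
  val (withSubquery, withoutSubquery) = partition(sample)
}
```

Here withSubquery holds the Exists predicate and the negated In predicate, while withoutSubquery holds only the plain attribute predicate.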

apply creates a Filter operator for the condition (sub)expressions without subqueries (combined with And expressions) if there are any. Otherwise, it takes the child operator of the input Filter unary operator as is.
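A minimal sketch of this step follows, again on hypothetical toy classes rather than Spark's own Filter and And. The helper name withoutSubqueries is an assumption made for illustration.

```scala
object FilterSketch {
  // Hypothetical toy model (NOT Spark's Catalyst classes).
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class And(left: Expr, right: Expr) extends Expr

  sealed trait Plan
  case class Relation(name: String) extends Plan
  case class Filter(condition: Expr, child: Plan) extends Plan

  // Recombine the subquery-free predicates with And and wrap the child
  // in a Filter; with no such predicates, return the child unchanged.
  def withoutSubqueries(preds: Seq[Expr], child: Plan): Plan =
    preds.reduceOption[Expr](And(_, _)) match {
      case Some(cond) => Filter(cond, child)
      case None       => child
    }

  val child: Plan = Relation("t1")
  val filtered  = withoutSubqueries(Seq(Attr("a"), Attr("b")), child)
  val untouched = withoutSubqueries(Seq.empty, child)
}
```

With two leftover predicates the result is a single Filter over their conjunction. With none, the original child is returned, so no empty Filter is introduced.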

In the end, apply creates a new logical plan with Join operators for Exists and In expressions (and their negations): left-semi joins for Exists and In with ListQuery, and left-anti joins for their negations.
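The final step can be pictured as a fold over the subquery predicates, each one wrapping the plan built so far in a Join. The sketch below uses hypothetical toy classes (not Spark's), omits join conditions entirely, and covers only the semi and anti cases named in this section. Anything else the real rule handles is out of scope here.

```scala
object JoinSketch {
  // Hypothetical toy model (NOT Spark's Catalyst classes).
  sealed trait JoinType
  case object LeftSemi extends JoinType
  case object LeftAnti extends JoinType

  sealed trait Plan
  case class Relation(name: String) extends Plan
  case class Join(left: Plan, right: Plan, joinType: JoinType) extends Plan

  sealed trait Pred
  case class Exists(sub: Plan) extends Pred
  case class In(column: String, sub: Plan) extends Pred // In with ListQuery
  case class Not(child: Pred) extends Pred

  // Fold every subquery predicate into a Join on top of the current plan.
  def rewrite(base: Plan, preds: Seq[Pred]): Plan =
    preds.foldLeft(base) {
      case (plan, Exists(sub))      => Join(plan, sub, LeftSemi)
      case (plan, Not(Exists(sub))) => Join(plan, sub, LeftAnti)
      case (plan, In(_, sub))       => Join(plan, sub, LeftSemi)
      case (plan, Not(In(_, sub)))  => Join(plan, sub, LeftAnti)
      case (plan, _)                => plan // other shapes: not modelled
    }

  // EXISTS over t2 followed by NOT IN over t3
  val rewritten: Plan = rewrite(
    Relation("t1"),
    Seq(Exists(Relation("t2")), Not(In("id", Relation("t3")))))
}
```

For the sample input, the result is a left-anti Join against t3 whose left side is a left-semi Join of t1 with t2.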
