Adaptive Query Execution

Adaptive Query Execution (aka Adaptive Query Optimisation or Adaptive Optimisation) is an optimisation of a query execution plan that lets the Spark Planner switch to an alternative execution plan at runtime when runtime statistics show that the alternative would be better optimised.

Quoting the description of a talk by the authors of Adaptive Query Execution:

At runtime, the adaptive execution mode can change shuffle join to broadcast join if it finds the size of one table is less than the broadcast threshold. It can also handle skewed input data for join and change the partition number of the next stage to better fit the data scale. In general, adaptive execution decreases the effort involved in tuning SQL query parameters and improves the execution performance by choosing a better execution plan and parallelism at runtime.
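
Two of the runtime decisions described in the quote can be illustrated with a toy model. The sketch below is plain Python, not Spark code: the function names, the greedy grouping rule, and the sample sizes are assumptions made for illustration, and skew handling is left out.

```python
from typing import List


def choose_join_strategy(left_bytes: int, right_bytes: int,
                         broadcast_threshold: int) -> str:
    """Toy model: pick a join strategy from table sizes observed at runtime."""
    # Switch to a broadcast join when the smaller side is below the threshold.
    if min(left_bytes, right_bytes) < broadcast_threshold:
        return "broadcast"
    return "shuffle"


def coalesce_partitions(partition_bytes: List[int],
                        target_bytes: int) -> List[List[int]]:
    """Toy model: group contiguous shuffle partitions into next-stage partitions.

    Returns lists of original partition indices. A group is closed as soon as
    adding the next partition would push it over target_bytes. An oversized
    partition still ends up in a group of its own (no skew splitting here).
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(partition_bytes):
        if current and current_size + size > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


# Illustrative values only: a 10 MB threshold and made-up partition sizes.
TEN_MB = 10 * 1024 * 1024
print(choose_join_strategy(2 * 1024 * 1024, 500 * 1024 * 1024, TEN_MB))
print(coalesce_partitions([10, 20, 30, 100, 5, 5], target_bytes=64))
```

In this model six small post-shuffle partitions collapse into three, which is the sense in which the partition number of the next stage is changed to fit the data scale.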

Adaptive Query Execution is disabled by default (in Spark releases before 3.2.0). Set the spark.sql.adaptive.enabled configuration property to true to enable it.
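
A minimal sketch of setting the property from PySpark, assuming an existing session object is passed in. The helper name enable_adaptive_execution is hypothetical; only the property name comes from the text.

```python
AQE_KEY = "spark.sql.adaptive.enabled"


def enable_adaptive_execution(spark) -> str:
    """Turn adaptive query execution on for the given session.

    `spark` is assumed to be a pyspark.sql.SparkSession (or any object that
    exposes the same `conf.set` / `conf.get` interface).
    Returns the value that is in effect after the change.
    """
    spark.conf.set(AQE_KEY, "true")
    return spark.conf.get(AQE_KEY)
```

With PySpark installed, this could be called as enable_adaptive_execution(SparkSession.builder.getOrCreate()). The same property can also be supplied at launch time, for example with --conf spark.sql.adaptive.enabled=true on spark-submit.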

Note
Adaptive query execution is not supported for streaming Datasets and is disabled when they are executed.

spark.sql.adaptive.enabled Configuration Property

The spark.sql.adaptive.enabled configuration property turns adaptive query execution on.

Tip
Use the adaptiveExecutionEnabled method of SQLConf to access the current value.

EnsureRequirements

EnsureRequirements is…​FIXME

Further Reading and Watching

  1. (video) An Adaptive Execution Engine For Apache Spark SQL by Carson Wang

  2. An adaptive execution mode for Spark SQL by Carson Wang (Intel), Yucai Yu (Intel) at Strata Data Conference in Singapore, December 7, 2017
