InsertIntoDataSourceCommand

InsertIntoDataSourceCommand Logical Command

InsertIntoDataSourceCommand is a RunnableCommand that inserts or overwrites data in an InsertableRelation (per overwrite flag).
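To make the target of the command concrete, here is a minimal sketch of an InsertableRelation. It assumes spark-sql is on the classpath; BufferRelation and BufferRelationDemo are hypothetical names made up for illustration, and keeping rows in driver memory is only sensible for a toy. The demo calls insert directly, which bypasses InsertIntoDataSourceCommand and merely shows the contract (data plus overwrite flag) that the command relies on.

```scala
import org.apache.spark.sql.{DataFrame, Row, SQLContext, SparkSession}
import org.apache.spark.sql.sources.{BaseRelation, InsertableRelation}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
import scala.collection.mutable.ArrayBuffer

// Hypothetical relation that keeps inserted rows in driver memory (illustration only).
class BufferRelation(
    override val sqlContext: SQLContext,
    override val schema: StructType)
  extends BaseRelation with InsertableRelation {

  val rows: ArrayBuffer[Row] = ArrayBuffer.empty[Row]

  // overwrite = true replaces the current content, overwrite = false appends to it.
  override def insert(data: DataFrame, overwrite: Boolean): Unit = {
    if (overwrite) rows.clear()
    rows ++= data.collect()
  }
}

object BufferRelationDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("demo").getOrCreate()
    import spark.implicits._

    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = false),
      StructField("name", StringType)))
    val relation = new BufferRelation(spark.sqlContext, schema)

    // Append two rows, then overwrite them with a single row.
    relation.insert(Seq((1, "a"), (2, "b")).toDF("id", "name"), overwrite = false)
    relation.insert(Seq((3, "c")).toDF("id", "name"), overwrite = true)

    assert(relation.rows.map(_.getInt(0)).toList == List(3))
    spark.stop()
  }
}
```

In a real data source the relation would typically be returned by a relation provider so that SQL INSERT statements against the table reach it through the analyzer; that wiring is left out here.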

InsertIntoDataSourceCommand is created exclusively when the DataSourceAnalysis logical resolution rule is executed (and resolves an InsertIntoTable unary logical operator with a LogicalRelation on an InsertableRelation).

InsertIntoDataSourceCommand returns the logical query plan as its only inner node (innerChildren), so the query is shown as a nested tree under this node when the plan is printed.

Executing Logical Command (Inserting or Overwriting Data in InsertableRelation) — run Method

Note
run is part of RunnableCommand Contract to execute (run) a logical command.

run takes the InsertableRelation (that is the relation of the LogicalRelation).

run then creates a DataFrame for the logical query plan and the input SparkSession.

run requests the DataFrame for its QueryExecution, which in turn is requested for the RDD[InternalRow] of the structured query. run also requests the LogicalRelation for its output schema.

With the RDD and the output schema, run creates another DataFrame that is the RDD[InternalRow] with the schema applied.

run requests the InsertableRelation to insert or overwrite data.

In the end, since the data in the InsertableRelation has changed, run requests the CacheManager to recacheByPlan with the LogicalRelation.

Note
run requests the SparkSession for SharedState that is in turn requested for the CacheManager.
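The order of the steps above can be sketched with a small, Spark-free model. Everything in it is a hypothetical stand-in (ToyTable for the InsertableRelation, ToyCacheManager for the CacheManager, a function for the logical query plan); it is not Spark's code, and it skips the step that re-applies the relation's schema to the rows. It only illustrates the sequence: take the relation, evaluate the query, insert or overwrite, then refresh whatever was cached for that relation.

```scala
import scala.collection.mutable

object RunFlowSketch {
  type Row = Seq[Any]

  // Stand-in for InsertableRelation: accepts rows plus the overwrite flag.
  trait ToyInsertableRelation {
    def insert(data: Seq[Row], overwrite: Boolean): Unit
  }

  final class ToyTable extends ToyInsertableRelation {
    val rows: mutable.ArrayBuffer[Row] = mutable.ArrayBuffer.empty[Row]
    def insert(data: Seq[Row], overwrite: Boolean): Unit = {
      if (overwrite) rows.clear()
      rows ++= data
    }
  }

  // Stand-in for CacheManager: keeps a snapshot per table and rebuilds it on request.
  final class ToyCacheManager {
    private val cached = mutable.Map.empty[ToyTable, Vector[Row]]
    def cache(t: ToyTable): Unit = cached(t) = t.rows.toVector
    def lookup(t: ToyTable): Option[Vector[Row]] = cached.get(t)
    // Only tables that are already cached are refreshed.
    def recacheByPlan(t: ToyTable): Unit =
      if (cached.contains(t)) cached(t) = t.rows.toVector
  }

  // Mirrors the order of the steps described for run.
  def run(
      table: ToyTable,
      query: () => Seq[Row],
      overwrite: Boolean,
      cacheManager: ToyCacheManager): Seq[Row] = {
    val relation: ToyInsertableRelation = table // take the relation
    val data = query()                          // evaluate the query
    relation.insert(data, overwrite)            // insert or overwrite
    cacheManager.recacheByPlan(table)           // cached data is stale, so refresh it
    Seq.empty                                   // this toy command produces no result rows
  }

  def main(args: Array[String]): Unit = {
    val table = new ToyTable
    val cacheManager = new ToyCacheManager

    run(table, () => Seq(Seq(1, "a"), Seq(2, "b")), overwrite = false, cacheManager)
    cacheManager.cache(table)
    run(table, () => Seq(Seq(3, "c")), overwrite = true, cacheManager)

    // Both the table and its cached snapshot now hold only the overwriting row.
    assert(table.rows.toList == List(Seq(3, "c")))
    assert(cacheManager.lookup(table).contains(Vector(Seq(3, "c"))))
  }
}
```

Without the recacheByPlan step the snapshot would still hold the two rows from before the overwrite, which is the stale-cache situation the last step of run guards against.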

Creating InsertIntoDataSourceCommand Instance

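InsertIntoDataSourceCommand takes the following when created: the LogicalRelation (whose relation is the InsertableRelation), the logical query plan that produces the data, and the overwrite flag.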
