
JdbcRelationProvider

JdbcRelationProvider is a DataSourceRegister that registers itself under the jdbc short name to handle the jdbc data source format.

Note
JdbcRelationProvider is registered through the META-INF/services/org.apache.spark.sql.sources.DataSourceRegister file, which is part of the source code of Apache Spark.
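The lookup by format name can be pictured with a small, self-contained Python model. It is only a sketch: the class and function names are hypothetical, the list stands in for the META-INF/services file, and the case-insensitive match on the short name is an assumption about how the requested format is compared.

```python
class JdbcProviderModel:
    """Hypothetical stand-in for JdbcRelationProvider."""

    def short_name(self):
        return "jdbc"


class CsvProviderModel:
    """Hypothetical second provider, only here so the lookup has a choice."""

    def short_name(self):
        return "csv"


# Plays the role of the META-INF/services listing of registered providers.
REGISTERED_PROVIDERS = [JdbcProviderModel, CsvProviderModel]


def lookup_provider(fmt):
    """Return the single provider class whose short name matches fmt."""
    matches = [
        cls for cls in REGISTERED_PROVIDERS
        if cls().short_name().lower() == fmt.lower()
    ]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one provider for {fmt!r}, found {len(matches)}")
    return matches[0]
```

With this model, lookup_provider("jdbc") resolves to JdbcProviderModel, and an unregistered name raises LookupError.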

JdbcRelationProvider is a RelationProvider and a CreatableRelationProvider.

JdbcRelationProvider is used when DataFrameReader is requested to load data from the jdbc data source.

Loading Data from Table Using JDBC — createRelation Method (from RelationProvider)

Note
createRelation is part of the RelationProvider Contract to create a BaseRelation for reading.

createRelation creates a JDBCPartitioningInfo (using JDBCOptions and the input parameters that correspond to the Options for JDBC Data Source).

Note
createRelation uses the partitionColumn, lowerBound, upperBound and numPartitions options for the partitioning information.

In the end, createRelation creates a JDBCRelation with column partitions (and JDBCOptions).
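To make the role of the four partitioning options concrete, here is a minimal Python sketch of how a numeric column range could be split into one WHERE clause per partition. It is loosely modeled on Spark's column partitioning, not copied from it: the function name is hypothetical, and it assumes non-negative integer bounds.

```python
def column_partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Return one WHERE-clause string per partition (None means 'no filter')."""
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    if num_partitions <= 1 or lower_bound == upper_bound:
        return [None]

    # Never use more partitions than there are values in the range.
    num_partitions = min(num_partitions, upper_bound - lower_bound)
    stride = upper_bound // num_partitions - lower_bound // num_partitions

    predicates = []
    current = lower_bound
    for i in range(num_partitions):
        # The first partition has no lower limit, the last one no upper limit,
        # so rows outside [lower_bound, upper_bound) are still read.
        lower = f"{column} >= {current}" if i != 0 else None
        current += stride
        upper = f"{column} < {current}" if i != num_partitions - 1 else None

        if upper is None:
            predicates.append(lower)
        elif lower is None:
            # NULLs in the partition column go to the first partition.
            predicates.append(f"{upper} or {column} is null")
        else:
            predicates.append(f"{lower} AND {upper}")
    return predicates
```

For example, column_partition_predicates("id", 0, 100, 4) yields four clauses with a stride of 25. The bounds only decide where the range is cut; they do not filter rows out.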

Writing Rows of Structured Query (DataFrame) to Table Using JDBC — createRelation Method (from CreatableRelationProvider)

Note
createRelation is part of the CreatableRelationProvider Contract to write the rows of a structured query (a DataFrame) to an external data source.

Internally, createRelation creates a JDBCOptions (from the input parameters).

createRelation reads caseSensitiveAnalysis (using the input sqlContext).

createRelation checks whether the table (given dbtable and url options in the input parameters) exists.

Note
createRelation uses a database-specific JdbcDialect to check whether a table exists.
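A table-existence check of this kind can be sketched with a probe query that returns no rows and fails only when the table is missing. The example below uses Python's built-in sqlite3 module as a stand-in database. The function name is hypothetical, and the probe query is an assumption about what a default dialect would issue.

```python
import sqlite3


def table_exists(conn, table):
    """Return True if a zero-row probe query against the table succeeds."""
    try:
        # WHERE 1=0 keeps the query cheap: no rows are ever returned.
        conn.execute(f"SELECT * FROM {table} WHERE 1=0")
        return True
    except sqlite3.Error:
        # The database rejects the query when the table is missing.
        return False
```

A dialect for a specific database could swap in a different probe query, which is why the check is delegated to the dialect.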

createRelation branches depending on whether the table already exists in the database.

If the table does not exist, createRelation creates the table (by executing CREATE TABLE with the createTableColumnTypes and createTableOptions options from the input parameters) and writes the rows to the database, with each partition written in a single transaction where the database supports it.
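How the two options could shape the CREATE TABLE statement is sketched below in Python. This is an illustration under assumptions: the function name is hypothetical, column types are passed as dictionaries rather than parsed from an option string, and the statement layout is a plausible one rather than a quote from Spark.

```python
def create_table_statement(table, default_column_types,
                           create_table_column_types=None,
                           create_table_options=""):
    """Build a CREATE TABLE statement.

    default_column_types: column name -> type derived from the DataFrame schema.
    create_table_column_types: per-column overrides (models createTableColumnTypes).
    create_table_options: text appended after the column list (models createTableOptions).
    """
    overrides = create_table_column_types or {}
    columns = ", ".join(
        f"{name} {overrides.get(name, default_type)}"
        for name, default_type in default_column_types.items()
    )
    # strip() drops the trailing space when no table options are given.
    return f"CREATE TABLE {table} ({columns}) {create_table_options}".strip()
```

Overrides affect only the columns they name. Every other column keeps its default type.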

If the table does exist, createRelation branches depending on the SaveMode (see Table 1).

Table 1. createRelation and SaveMode

Name           Description
Append         Saves the records to the table.
ErrorIfExists  Throws an AnalysisException.
Ignore         Does nothing.
Overwrite      Truncates or drops (and re-creates) the table, then saves the records.

Note
createRelation truncates the table only when the truncate JDBC option is enabled and isCascadingTruncateTable is disabled. Otherwise, it drops the table.
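The write path described above can be summarized as one small decision procedure. The following Python sketch runs it against an in-memory SQLite database. All names are hypothetical, the table has a fixed two-column schema, RuntimeError stands in for AnalysisException, and DELETE stands in for TRUNCATE TABLE (which SQLite lacks).

```python
import sqlite3


def table_exists(conn, table):
    """Zero-row probe query; fails only when the table is missing."""
    try:
        conn.execute(f"SELECT * FROM {table} WHERE 1=0")
        return True
    except sqlite3.Error:
        return False


def save(conn, table, rows, mode, truncate=False, cascading_truncate=False):
    """Write (id, name) rows to the table following the SaveMode rules."""
    ddl = f"CREATE TABLE {table} (id INTEGER, name TEXT)"
    insert = f"INSERT INTO {table} VALUES (?, ?)"

    if not table_exists(conn, table):
        # Missing table: create it and write the rows, whatever the mode.
        conn.execute(ddl)
        conn.executemany(insert, rows)
    elif mode == "Append":
        conn.executemany(insert, rows)
    elif mode == "Overwrite":
        if truncate and not cascading_truncate:
            conn.execute(f"DELETE FROM {table}")  # stand-in for TRUNCATE TABLE
        else:
            conn.execute(f"DROP TABLE {table}")
            conn.execute(ddl)
        conn.executemany(insert, rows)
    elif mode == "ErrorIfExists":
        # Hypothetical message; stands in for AnalysisException.
        raise RuntimeError(f"table {table} already exists")
    elif mode == "Ignore":
        pass  # leave the existing table untouched
    else:
        raise ValueError(f"unknown save mode: {mode}")
    conn.commit()
```

Note that the mode matters only once the table exists. On a missing table, every mode ends in the same create-and-write step.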

In the end, createRelation closes the JDBC connection to the database and creates a JDBCRelation.
