JDBCRDD

JDBCRDD is an RDD of internal binary rows that represents a structured query over a table in a database accessed via JDBC.

Note
JDBCRDD represents a “SELECT requiredColumns FROM table” query.

JDBCRDD is created exclusively when JDBCRDD is requested to scanTable (when JDBCRelation is requested to build a scan).

Table 1. JDBCRDD’s Internal Properties (e.g. Registries, Counters and Flags)

Name | Description
columnList | Column names. Used when…FIXME
filterWhereClause | Filters as a SQL WHERE clause. Used when…FIXME
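The two properties above, together with the “SELECT requiredColumns FROM table” note, suggest how the final query text could be assembled. The following is a minimal sketch in plain Scala, not Spark's actual code: the object JdbcQuerySketch and its methods are hypothetical names, the predicates are assumed to be already compiled to SQL strings, and the fallback to the constant 1 for an empty column list is an assumption of this sketch.

```scala
// Simplified model (hypothetical names, not Spark's implementation):
// builds a SELECT statement from a column list and compiled filter strings.
object JdbcQuerySketch {
  // Joins (already quoted) column names with commas. Falls back to the
  // constant 1 when no columns are required, so the SQL stays valid.
  def columnList(columns: Seq[String]): String =
    if (columns.isEmpty) "1" else columns.mkString(",")

  // Wraps every compiled predicate in parentheses and ANDs them together.
  // Returns an empty string when there are no predicates.
  def filterWhereClause(compiledFilters: Seq[String]): String =
    compiledFilters.map(p => s"($p)").mkString(" AND ")

  // Puts both pieces into a "SELECT ... FROM ... [WHERE ...]" statement.
  def selectStatement(
      table: String,
      columns: Seq[String],
      compiledFilters: Seq[String]): String = {
    val where = filterWhereClause(compiledFilters)
    val whereSql = if (where.isEmpty) "" else s" WHERE $where"
    s"SELECT ${columnList(columns)} FROM $table$whereSql"
  }
}
```

For example, JdbcQuerySketch.selectStatement("people", Seq("id"), Seq("age > 21")) yields SELECT id FROM people WHERE (age > 21). Wrapping each predicate in parentheses keeps operator precedence intact when several filters are combined.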

Computing Partition (in TaskContext) — compute Method

Note
compute is part of Spark Core’s RDD Contract to compute a partition (in a TaskContext).

compute…​FIXME

resolveTable Method

resolveTable…​FIXME

Note
resolveTable is used exclusively when JDBCRelation is requested for the schema.

Creating RDD for Distributed Data Scan — scanTable Object Method

scanTable takes the url option.

scanTable finds the corresponding JDBC dialect (per the url option) and requests it to quote the column identifiers in the input requiredColumns.

scanTable uses the JdbcUtils object to createConnectionFactory and prune columns from the input schema to include the input requiredColumns only.

In the end, scanTable creates a new JDBCRDD.

Note
scanTable is used exclusively when JDBCRelation is requested to build a distributed data scan with column pruning and filter pushdown.
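The quoting and pruning steps of scanTable can be illustrated with a small, self-contained sketch. Everything here is a stand-in: Field models a schema field, quoteIdentifier mimics a dialect that quotes with double quotes (other dialects may use different quote characters), and pruneSchema is one plausible way to keep only the required columns. None of it is taken verbatim from Spark.

```scala
// Hypothetical stand-ins for Spark's schema and dialect types.
object ScanTableSketch {
  // Simplified schema field: a name plus a type label.
  final case class Field(name: String, dataType: String)

  // Assumed dialect behavior: wrap the identifier in double quotes.
  def quoteIdentifier(colName: String): String = "\"" + colName + "\""

  // Keeps only the required columns, in the order they were requested.
  // Throws NoSuchElementException if a required column is not in the schema.
  def pruneSchema(schema: Seq[Field], requiredColumns: Seq[String]): Seq[Field] = {
    val byName = schema.map(f => f.name -> f).toMap
    requiredColumns.map(byName)
  }
}
```

With a schema of id, name and age and requiredColumns of Seq("age", "id"), the pruned schema contains age and id in that order, and the quoted column list becomes "age","id".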

Creating JDBCRDD Instance

JDBCRDD takes the following when created:

  • SparkContext

  • Function to create a Connection (() ⇒ Connection)

  • Schema (StructType)

  • Array of column names

  • Array of Filter predicates

  • Array of Spark Core’s Partitions

  • Connection URL

  • JDBCOptions

JDBCRDD initializes the internal registries and counters.
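The second constructor argument is a function of type () ⇒ Connection rather than a live connection. A likely motivation is that a java.sql.Connection cannot be shipped to executors, whereas a function can be, so each task can open its own connection when it runs. The sketch below shows the idea with the standard java.sql.DriverManager; the object and method names are hypothetical, and Spark's own createConnectionFactory may be implemented differently.

```scala
import java.sql.{Connection, DriverManager}
import java.util.Properties

// Hypothetical helper, not Spark's JdbcUtils.
object ConnectionFactorySketch {
  // Returns a function that opens a connection only when it is invoked.
  // Creating the function itself does not touch the database.
  def connectionFactory(url: String, props: Properties): () => Connection =
    () => DriverManager.getConnection(url, props)
}
```

Calling ConnectionFactorySketch.connectionFactory(url, props) is cheap and side-effect free; the connection attempt (and any SQLException) happens only when the returned function is applied.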

getPartitions Method

Note
getPartitions is part of Spark Core’s RDD Contract to…​FIXME

getPartitions simply returns the partitions (this JDBCRDD was created with).

pruneSchema Internal Method

pruneSchema…​FIXME

Note
pruneSchema is used when…​FIXME

Converting Filter Predicate to SQL Expression — compileFilter Object Method

compileFilter…​FIXME

Note

compileFilter is used when:
